124 research outputs found

Further improvements of Steiner tree approximations

Author: Karpinski M.
Zelikovsky A.
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/1994

The Steiner tree problem asks for a shortest tree connecting a given set of terminal points in a metric space. We propose an improved, fast heuristic for the Steiner problem in graphs and in the rectilinear plane. This heuristic finds a Steiner tree at most 1.757 and 1.267 times longer than the optimal solution in graphs and in the rectilinear plane, respectively.

MPG.PuRe

Area fill synthesis for uniform layout density

Author: A. Zelikovsky
A.B. Kahng
G. Robins
Yu Chen
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Filling algorithms and analyses for layout density control

Author: A. Singh
A. Zelikovsky
A.B. Kahng
G. Robins
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)

Crossref

Efficient error correction and haplotypes reconstruction for deep sequencing of hepatitis C amplicons

Author: Campo D.
Dimitrova Z.
Forbi J.
Khudyakov Y.
Rossi L.
Skums P.
Vaughan G.
Yokosawa J.
Zelikovsky A.
Publication venue: Belarusian State University (БГУ)

Section 1. Information Security and Computer Data Analysis

We present two new highly efficient pyrosequencing error correction algorithms: (i) k-mer-based error correction (KEC); and (ii) empirical frequency threshold (ET). Both were compared to the recently published clustering algorithm SHORAH to evaluate their relative performance using 24 experimental datasets obtained by 454-sequencing of amplicons with known sequences. We found that all three algorithms showed similar performance in terms of finding true haplotypes, but the KEC and ET methods significantly outperformed SHORAH both in their ability to remove false haplotypes and in estimating the frequency of true ones.

A PTAS for planar group Steiner tree via spanner bootstrapping and prize collecting

Author: Bateni M.
Berman P.
Demaine E. D.
Fomin F. V.
Gudmundsson J.
Hougardy S.
Mitchell J. S. B.
Reich G.
Zelikovsky A.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2016

We present the first polynomial-time approximation scheme (PTAS), i.e., a (1 + ϵ)-approximation algorithm for any constant ϵ > 0, for the planar group Steiner tree problem (in which each group lies on a boundary of a face). This result improves on the best previous approximation factor of O(log n (log log n)^O(1)). We achieve this result via a novel and powerful technique called spanner bootstrapping, which allows one to bootstrap from a superconstant approximation factor (even superpolynomial in the input size) all the way down to a PTAS. This is in contrast with the popular existing approach for planar PTASs of constructing lightweight spanners in one iteration, which notably requires a constant-factor approximate solution to start from. Spanner bootstrapping removes one of the main barriers for designing PTASs for problems which have no known constant-factor approximation (even on planar graphs), and thus can be used to obtain PTASs for several difficult-to-approximate problems. Our second major contribution required for the planar group Steiner tree PTAS is a spanner construction, which reduces the graph to have total weight within a factor of the optimal solution while approximately preserving the optimal solution. This is particularly challenging because group Steiner tree requires deciding which terminal in each group to connect by the tree, making it much harder than recent previous approaches to construct spanners for planar TSP by Klein [SIAM J. Computing 2008], subset TSP by Klein [STOC 2006], Steiner tree by Borradaile, Klein, and Mathieu [ACM Trans. Algorithms 2009], and Steiner forest by Bateni, Hajiaghayi, and Marx [J. ACM 2011] (and its improvement to an efficient PTAS by Eisenstat, Klein, and Mathieu [SODA 2012]).
The main conceptual contribution here is realizing that selecting which terminals may be relevant is essentially a complicated prize-collecting process: we have to carefully weigh the cost and benefits of reaching or avoiding certain terminals in the spanner. Via a sequence of involved prize-collecting procedures, we can construct a spanner that reaches a set of terminals that is sufficient for an almost-optimal solution. Our PTAS for planar group Steiner tree implies the first PTAS for geometric Euclidean group Steiner tree with obstacles, as well as a (2 + ϵ)-approximation algorithm for group TSP with obstacles, improving over the best previous constant-factor approximation algorithms. By contrast, we show that planar group Steiner forest, a slight generalization of planar group Steiner tree, is APX-hard on planar graphs of treewidth 3, even if the groups are pairwise disjoint and every group is a vertex or an edge.

DSpace@MIT

Crossref

SZTAKI Publication Repository

Estimation of alternative splicing isoform frequencies from RNA-Seq data

Author: A Mortazavi
A Oshlack
A Roberts
Alex Zelikovsky
B Jackson
B Langmead
B Li
B Paşaniuc
BE Howard
C Trapnell
CP Ponting
D Hiller
E Wang
H Jiang
H Richard
I Birol
Ion I Măndoiu
J Bloom
J Clarke
J Eid
J Feng
KD Hansen
M Anton
M Griffith
M Guttman
M Sultan
Marius Nicolae
P Carninci
Serghei Mangul
Team MGC Project
V Lacroix
Y She
Y Surget-Groba
Z Wang
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Massively parallel whole transcriptome sequencing, commonly referred to as RNA-Seq, is quickly becoming the technology of choice for gene expression profiling. However, due to the short read length delivered by current sequencing technologies, estimation of expression levels for alternative splicing gene isoforms remains challenging. Results: In this paper we present a novel expectation-maximization algorithm for inference of isoform- and gene-specific expression levels from RNA-Seq data. Our algorithm, referred to as IsoEM, is based on disambiguating information provided by the distribution of insert sizes generated during sequencing library preparation, and takes advantage of base quality scores, strand and read pairing information when available. The open source Java implementation of IsoEM is freely available at http://dna.engr.uconn.edu/software/IsoEM/. Conclusions: Empirical experiments on both synthetic and real RNA-Seq datasets show that IsoEM has scalable running time and outperforms existing methods of isoform and gene expression level estimation. Simulation experiments confirm previous findings that, for a fixed sequencing cost, using reads longer than 25-36 bases does not necessarily lead to better accuracy for estimating expression levels of annotated isoforms and genes.

Crossref

ScholarWorks @ Georgia State University

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Dagstuhl Research Online Publication Server

Efficient error correction for next-generation sequencing of viral amplicons

Author: A Gilles
Alex Zelikovsky
B Georgescu
C Quince
D Comaniciu
D Eckels
David S Campo
F Lopez-Labrador
G Wang
Gilberto Vaughan
H Wang
Jonny Yokosawa
Joseph C Forbi
L Salmela
L Van Doorn
Livia Rossi
M Alter
M Chaisson
M Isaguliants
M Larkin
Mathworks
N Pavio
O Zagordi
P Pevzner
P Simmonds
Pavel Skums
Q Choo
S Ramachandran
X Zhao
Yury Khudyakov
Zoya Dimitrova
Publication venue: BioMed Central
Publication date: 01/01/2012

Background: Next-generation sequencing allows the analysis of an unprecedented number of viral sequence variants from infected patients, presenting a novel opportunity for understanding virus evolution, drug resistance and immune escape. However, sequencing in bulk is error prone. Thus, the generated data require error identification and correction. Most error-correction methods to date are not optimized for amplicon analysis and assume that the error rate is randomly distributed. Recent quality assessment of amplicon sequences obtained using 454-sequencing showed that the error rate is strongly linked to the presence and size of homopolymers, position in the sequence and length of the amplicon. All these parameters are strongly sequence specific and should be incorporated into the calibration of error-correction algorithms designed for amplicon sequencing. Results: In this paper, we present two new efficient error correction algorithms optimized for viral amplicons: (i) k-mer-based error correction (KEC) and (ii) empirical frequency threshold (ET). Both were compared to a previously published clustering algorithm (SHORAH), in order to evaluate their relative performance on 24 experimental datasets obtained by 454-sequencing of amplicons with known sequences. All three algorithms show similar accuracy in finding true haplotypes. However, KEC and ET were significantly more efficient than SHORAH in removing false haplotypes and estimating the frequency of true ones. Conclusions: Both algorithms, KEC and ET, are highly suitable for rapid recovery of error-free haplotypes obtained by 454-sequencing of amplicons from heterogeneous viruses. The implementations of the algorithms and data sets used for their testing are available at: http://alan.cs.gsu.edu/NGS/?q=content/pyrosequencing-error-correction-algorithm

Crossref

ScholarWorks @ Georgia State University

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Inferring viral quasispecies spectra from 454 pyrosequencing reads

Author: A Sundquist
Alex Zelikovsky
AR Quinlan
B Gaschen
Bassam Tork
D Brinza
DC Douek
E Domingo
E Martinez-Salas
EA Duarte
G Myers
H Fakhrai-Rad
Ion Măndoiu
Irina Astrovskaya
JC de la Torre
JC Venter
JI Esteban
JJ Holland
JW Drake
K Westbrooks
Kelly Westbrooks
M Eigen
M Margulies
MC Prosperi
MJ Chaisson
N Beerenwinkel
N Eriksson
NM Laird
O Zagordi
Peter Balfe
R Lippert
S Balser
S Hoffmann
S-Y Rhee
Serghei Mangul
SL Fishman
ST O’Neil
T von Hahn
V Bansal
W Brockman
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: RNA viruses infecting a host usually exist as a set of closely related sequences, referred to as quasispecies. The genomic diversity of viral quasispecies is a subject of great interest, particularly for chronic infections, since it can lead to resistance to existing therapies. High-throughput sequencing is a promising approach to characterizing viral diversity, but unfortunately standard assembly software was originally designed for single genome assembly and cannot be used to simultaneously assemble and estimate the abundance of multiple closely related quasispecies sequences. Results: In this paper, we introduce a new Viral Spectrum Assembler (ViSpA) method for quasispecies spectrum reconstruction and compare it with the state-of-the-art ShoRAH tool on both simulated and real 454 pyrosequencing shotgun reads from HCV and HIV quasispecies. Experimental results show that ViSpA outperforms ShoRAH on simulated error-free reads, correctly assembling 10 out of 10 quasispecies and 29 sequences out of 40 quasispecies. While ShoRAH has a significant advantage over ViSpA on reads simulated with sequencing errors due to its advanced error correction algorithm, ViSpA is better at assembling the simulated reads after they have been corrected by ShoRAH. ViSpA also outperforms ShoRAH on real 454 reads. Indeed, the 7 most frequent sequences reconstructed by ViSpA from a real HCV dataset are viable (do not contain internal stop codons), and the most frequent sequence was within 1% of the actual open reading frame obtained by cloning and Sanger sequencing. In contrast, only one of the sequences reconstructed by ShoRAH is viable. On a real HIV dataset, ShoRAH correctly inferred only 2 quasispecies sequences with at most 4 mismatches whereas ViSpA correctly reconstructed 5 quasispecies with at most 2 mismatches, and 2 out of 5 sequences were inferred without any mismatches.
ViSpA source code is available at http://alla.cs.gsu.edu/~software/VISPA/vispa.html. Conclusions: ViSpA enables accurate viral quasispecies spectrum reconstruction from 454 pyrosequencing reads. We are currently exploring extensions applicable to the analysis of high-throughput sequencing data from bacterial metagenomic samples and ecological samples of eukaryote populations.

Crossref

ScholarWorks @ Georgia State University

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Transcriptome assembly and quantification from Ion Torrent RNA-Seq data

Author: A Mortazavi
A Roberts
Adrian Caciula
AI Tomescu
Alex Zelikovsky
AM Mezlini
B Li
C Gregg
C Ponting
C Trapnell
CJ McManus
Consortium MAQC
DR Bentley
Dumitru Brinza
E Wang
G Robertson
Ion Măndoiu
J Duitama
J Feng
JF Degner
JM Rothberg
KF Au
L Song
LH Reid
M Garber
M Grabherr
M Griffith
M Guttman
M Nicolae
PA Pevzner
R Tibshirani
RK Thomas
S Mangul
S Pal
Sahar Al Seesi
Serghei Mangul
TR Mercer
V Pandey
W Li
YY Lin
Z Wang
Publication venue: Springer Science and Business Media LLC

Crossref